Multiscale topology characterizes dynamic tumor vascular networks

Advances in imaging techniques enable high-resolution three-dimensional (3D) visualization of vascular networks over time and reveal abnormal structural features such as twists and loops, and their quantification is an active area of research. Here, we showcase how topological data analysis, the mathematical field that studies the “shape” of data, can characterize the geometric, spatial, and temporal organization of vascular networks. We propose two topological lenses to study vasculature, which capture inherent multiscale features and vessel connectivity, and surpass the single-scale analysis of existing methods. We analyze images collected using intravital and ultramicroscopy modalities and quantify spatiotemporal variation of twists, loops, and avascular regions (voids) in 3D vascular networks. This topological approach validates and quantifies known qualitative trends such as dynamic changes in tortuosity and loops in response to antibodies that modulate vessel sprouting; furthermore, it quantifies the effect of radiotherapy on vessel architecture.


This PDF file includes:
Topological data analysis Computational differences between datasets Supporting experimental information Tortuosity in the ultramicroscopy data Alternative results figures and statistical analysis Additional results and statistical analysis Table S1 Figs. S1 to S28 References Topological data analysis Homology and simplicial complexes. Persistent homology is based on the topological concept of homology (for intuitive introductions, see, for example, references (23,71); for more formal introductions see references (72)(73)(74)). Homology allows one to study shapes and forms disregarding any changes caused by stretching or bending. One can study the properties of a topological space by partitioning it into smaller, topologically simpler pieces, which when reassembled include the same aggregate topological information as the original space. Topological spaces can be very simple. Two trivial examples are the empty set X = ; or a space that consists of one single point X = {x}. If we want to capture the topological properties of the second example X = {x}, we could simply choose a single node to represent it. However, a node or even a collection of nodes does not allow one to capture the topological properties of more complicated spaces, such as a 2-sphere or the surface of the earth. In such cases, one needs a simple object that carries the information that the space is connected but also encloses a hole.
Consider, for example, a collection of triangles glued together to form a hollow tetrahedron; this is an example of a mathematical object called a simplicial complex. The building blocks that one uses to approximate topological spaces are called n-simplices which one can think of as generalised triangles. The parameter n indicates the dimension of the simplex. Every n-simplex contains n +1 independent nodes: a point is a 0-simplex, an edge is a 1-simplex, a triangle is a 2-simplex, and a (solid) tetrahedron is a 3-simplex. By using a numbering x i of vertices, we can write a 0-simplex as [x 0 ], a 1-simplex as [x 0 , x 1 ], a 2-simplex as [x 0 , x 1 , x 2 ], and a 3-simplex as [x 0 , x 1 , x 2 , x 3 ]. The lower-dimensional simplices form so-called faces of the associated higher-dimensional objects. One combines different simplices into a simplicial complex X to capture all different aspects of a topological space. Two simplices that are part of a simplicial complex are allowed to intersect only in common faces. The dimension of a simplicial complex is defined to be the dimension of its highest-dimensional simplex. A subcollection of a simplicial complex X is called a subcomplex of X if it forms a simplicial complex itself.
For every simplicial complex X we can define a vector space C n (X) that is spanned by its n-simplices with coefficients in the field Z/2Z. The elements of the vector space C n (X) are called n-chains. We can now define a linear map, the so-called boundary operator, between vector spaces C n (X) and C n 1 (X) which takes every n-simplex x to the (alternating) sum of its faces, i.e. its boundary: @ n : C n (X) ! C n 1 (X), i.e. in the j-th summand we omit the vertex x j from the vertices spanning the (n 1)-simplex.
Note that the sum in Equation (1) is over the field Z/2Z where ( 1) = 1, i.e. we can omit the ( 1) j term in the above equation. We can use the boundary operator to connect all n-chains of a simplicial complex X in a sequence, the so-called chain complex C = {C n , @ n }: . . .
We can represent a collection of edges that are connected to form a loop in a simplicial complex as a 1-chain, for example, [x 0 , x 1 ] + [x 1 , x 2 ] + · · · + [x j , x 0 ]. If we apply the boundary operator to this 1-chain, we obtain @([x 0 , In contrast, for a collection of edges that does not form a loop this is not the case, e.g., @([x 0 , Chains that are in the kernel of @ n , i.e. their boundary is zero, are called n-cycles.
One can compute that the composition of two boundary maps yields zero, i.e.
since the boundary of a boundary is empty. The image im @ n+1 of the boundary operator is therefore a subspace of the kernel ker @ n and its elements are called n-boundaries.
One can associate a family of vector spaces known as homology groups to a simplicial complex X based on its cycles and boundaries. For every dimension n 0 one defines the nth homology group as: In dimension 2, the elements of the homology group H 2 are called voids; in dimension 1, the elements of the homology group H 1 are called loops; in dimension 0, the elements of the homology group H 0 are called connected components. Two elements in H n are considered to be different, if they differ by more than a boundary, i.e. if they represent different n-dimensional holes. We then say that they belong to different homology classes.
We measure the number of n-dimensional holes of a simplicial complex by considering its nth Betti number n : The first three Betti numbers, 0 , 1 , and 2 , represent, respectively, the number of connected components, the number of 1-dimensional holes, and the number of 2-dimensional holes (i.e. voids) in a simplicial complex.
Persistent homology. While homology gives information about a single simplicial complex, PH allows one to study topological features across embedded sequences, so-called filtrations, of simplicial complexes, which can be constructed from data. A filtration (17,18,49) of a simplicial complex X is a sequence of embedded simplicial complexes, starting with the empty complex and ending with the entire simplicial complex X. The simplicial complexes in the filtration are connected by inclusion maps. One can now apply an important property of homology, functoriality: any map between simplicial complexes f i,j : X i ! X j induces a map between their n-chainsf n i,j : C n (X i ) ! C n (X j ) which induces a map between their homology groups f n i,j : H n (X i ) ! H n (X j ). In particular, this means that there exist maps between the homology groups of every simplicial complex in a filtration, e.g., there are maps that relate the voids, loops or connected components in simplicial complexes across a filtration.
One can visualise topological features such as loops or connected components across a filtration in a summary diagram called a barcode (49,79). For an appropriate choice of basis (80) of the homology groups H n , a barcode represents the information carried by the homology groups and the maps f n i,j : For topological features that persist until the last filtration step (and beyond), the persistence is said to be infinite.

Computational differences between datasets
The differences in the biology and the imaging of our datasets led to discrepancy in computational feasibility (see Table S1). In particular, the network sizes and penetration depth of the imaging differed considerably, which significantly affected the computations for the radial filtration. We first performed the majority of computations for the intravital data on a IBM System x3550 M4 16 core server with 768 GB RAM over 3 months but were not able to obtain all results. For the ultramicroscopy data as well as the remaining intravital data, we required a Dual Intel Xeon Gold 6240M 18 core processor system with 3TB of RAM to complete computations over further 3 months. While for the intravital data we were able to compute the radial filtration on all networks in the dataset (in some cases after reduction approaches for the number of nodes, see Data preprocessing in Materials and Methods), in the ultramicroscopy data, we were not able to obtain results for one of the control tumours on day 14 of observation despite reducing the number of nodes (see Section Data preprocessing in Materials and Methods).

Supporting Experimental Information
In order to visualize the response of the tumor vasculature, we used a transgenic mouse model in which the fluorescent protein tdTomato is expressed in both normal and tumor endothelial cells (EC). We used transgenic mice bearing a Cre recombinase-tamoxifen receptor fusion protein (Cre-ERT2) driven by the VE cadherin promoter. These mice were crossed with Gt(ROSA)26Sortm9(CAG-tdTomato)Hze mice so that activation of Cre by tamoxifen resulted in EC expression of tdTomato (schematic shown in Figure S1A). For imaging purposes we only used mice with greater than 95% fluorescent EC. The segmentation of tumor blood vessels was based on the TECs expressing tdTomato. We used intravenous injection of Qdots 705 to distinguish perfused from non-perfused tumor vessels, i.e. vessels labelled with the infused Qdots and vessels not labelled with it. As further evidence, we note that no Qdot positive, endothelial negative vessels were identified. If the Qdots were in the lymphatics then they would have identified vessels not lined by vascular endothelium; this did not happen.

Tortuosity in the ultramicroscopy data
The tortuosity descriptor is defined as the ratio of the number of short bars ( 10% of maximal radius used in the radial filtration) in dimension 0 barcodes to the number of vessel segments.
We divided by the number of vessel segments to ensure the contributions are topological and are not masked by an increase or decrease of vasculature. However, in the case of bevacizumab in the ultramicroscopy data, the significant decrease in number of vessel segments (8), which is also visually apparent when looking at examples of extracted vascular networks (see Fig. 2), leads to a seemingly contradictory increase of tortuosity (see Figure S2a). This is supported by a correlation between our tortuosity measure and the size of voids which we observed in this dataset (see Fig. S28). When considering the raw number of short bars without dividing by the number of vessel segments, we observe the expected effect of bevacizumab on tortuosity (see Figure S1: tdTomato expression in ECs and TECs in VE-TOM mice. A) Schematic of the Ve-Cad (Cre-ERT2) system. Administration of tamoxifen by gavaging in adult VE-TOM mice activates the Cre-LoxP system in endothelial cells inducing tdTomato expression. B) TECs expressing tdTomato (cyan), co-stained for CD31 (white). C) TECs expressing tdTomato (cyan), co-stained for CD31 (white). Representative image of a MC38 tumor. GFP positive tumor cells (yellow), TECs (cyan), and infused Qdots (purple) indicating perfused vessels. Scale bar in B and C is 50µm. Figure S2b).
As discussed in Section Computational differences between datasets, the vascular networks in the ultramicroscopy data are much larger than in the intravital dataset. However, they are less well resolved in the xy-plane (see Data description in Materials and Methods). This has two consequences on our analysis of tortuosity: 1) the tortuosity of vessels is likely to be captured to a lesser degree than in the intravital data, 2) the number of filtration steps needed to be able to capture tortuosity adequately would need to be significantly higher than the 500 used in the radial filtration. Indeed, example images from this data (see Fig. 2) do not appear to show strikingly tortuous vessels. Moreover, our computation of the radial filtration in 500 steps was already at the edge of computational feasibility (see Section Computational differences between datasets). Thus further refinement of the filtration is not possible. Alternatively, we can observe the coarse trends change over time without a normalisation by the number of vessel segments as shown in Figure S2b. While our topological descriptor therefore quantified a genuine and significant change in the vascular networks on the ultramicrocopy data, its interpretation here needs to be made with care.

Alternative results figures and statistical analysis
We show alternative representations of our results from the main text. In Fig. S3 we present the results for the intravital data as mean time series for each treatment group with error bars (standard error of the mean) to highlight that our data is dynamic over time.
In Fig. S4 we present our results including p-values from our statistical analysis. We compute the (non-exact and unadjusted) p-values for the intravital data using the R function pairwise.wilcox.test() in RSTUDIO (76) to perform a pairwise Wilcoxon's rank sum test between the control group and each of the treatment groups. For the ultramicroscopy data we use the function stat compare means() from the library ggpubr to perform Wilcoxon's rank sum test. All our tests are by default twosided.
In Fig. S5 we present time-series of the spatio-temporal resolution of the intravital data.  Figure S3: Topological descriptors extracted from tumour blood vessel networks treated with vascular targeting agents with known effects II. a) Intravital data results. We normalised all descriptors with respect to values on the day on which treatment is administered (day 0) or, for controls, the day on which observations commence (day 0). Data was collected from control mice (beige), mice treated with the vascular targeting agent DC101 (37) (dark pink), mice treated with vascular targeting agent anti-Dll4 (39) (light pink), mice treated with fractionated irradiation (FIR, brown), and mice treated with single dose irradiation (IR,blue). Tortuosity was computed as the ratio of short bars ( 10% of maximal radius used in the radial filtration) in the dimension 0 barcodes of the radial filtration to the number of vessel segments. Loops are the number of bars in the dimension 1 barcodes of the radial filtration per vessel segment in the network. b) Ultramicroscopy data results. Due to the snapshot nature of the data (one time point per tumour), all reported topological descriptors are raw values. Data was collected from control mice (beige) and mice treated with bevacizumab (purple). We computed tortuosity values and the number of vessel loops per vessel segment, in the same way as for the intravital data. We also determined the size of voids (avascular regions) by computing the median length of bars in the dimension 2 barcodes of the ↵-complex filtration.       Intravital data: single dose irradiation versus fractionated dose irradiation. We present statistical analysis on the control group and radiation treatment groups IR (single-dose irradiation) and FIR (fractionated-dose irradiation) in the intravital data. We use the function stat compare means() from the library ggpubr to compute Kruskal-Wallis test p-values for tortuosity (see Fig. S17) and number of loops per vessel segment (see Fig. S18). All values are normalised by day 0 of observation/treatment. Intravital data: all treatment groups. We present statistical analysis to determine whether at least one of the treatment groups in the intravital data behaves significantly differently to the others in Fig. S19 for our extracted tortuosity measure and in Fig. S20 for the number of loops per vessel segment. All values are normalised by day 0 of observation/treatment. We compute the (non-exact) p-values for the using the R function kruskal.test() to compute Kruskal-Wallis in RSTUDIO (76). We further present the same analysis for parameters not shown in the main text, i.e. for voids in Fig. S21 and maximal radii used in the radial filtration (i.e. an approximation of the tumour radii) in Fig. S22. Again, all values are normalised by day 0 of observation/treatment. We note that both of these parameters do not show significant differences between treatment groups. In the case of the voids in the intravital dataset this can be explained by the the low penetration depth of the imaging.  Finally, we present a correlation analysis between parameters that are conventionally extracted from vascular networks and our topological parameters in Fig. S23. We compute pairwise Pearson correlation using the library hmisc and plot our results including a complete linkage clustering dendrogramme of the parameters using the library corrplot in RSTUDIO (76).  We present the distribution of loops in the ultramicroscopy data relative to the tumour radii in Fig. S27. We apply a Anderson-Darling test using the function ad.test() from the library ksamples in RSTUDIO (76) to the different time points and treatment groups to determine whether the samples within one groups come from a common (unspecified) distribution. We do not find this to be the case in any of the groups for any time point.  Figure S27: Spatial distribution of the number of loops in the ultramicroscopy data. We show the distribution of loops in individual tumours grouped by treatment regime (top row: bevacicumab treated tumours; bottom row: control tumours) and time points (column 1: day 1 after treatment; column 2: day 2 after treatment; column 3: day 3 after treatment; column 4: day 4 after treatment). The horizontal axis represents the radial distance to the tumour centre normalised by tumour radius.
Finally, we present a correlation analysis between parameters that were extracted by Dobosz et al. (8) and our topological parameters in Fig. S23. We also include the number of segments and branching points determined by our extraction of the vessel networks with UNET-CORE (44).
Both of these standard parameters correlate strongly with the same parameters extracted by Dobosz et al. (8). We compute pairwise Pearson correlation using the library hmisc and plot our results including a complete linkage clustering dendrogramme of the parameters using the library corrplot in RSTUDIO (76).  Figure S28: Heatmap displaying the pairwise Pearson correlation coefficients between different vascular characteristics derived from the ultramicroscopy data. The dendrogramme represents complete linkage clustering using the Euclidean distance measure. We consider the following vascular characteristics: number of vessel segments as computed by (8) (segments old), number of branching points as computed by (8) (branching points old), number of branching points as computed by unet, number of vessel segments as computed by unet, number of vessel loops, necrotic tumour volume as computed by (8), tumour volume as computed by (8), vital tumour volume as computed by (8), maximal radius used in the radial filtration, number of vessel loops per vessel segment, median persistence of bars in dimension 2 barcodes (voids), number of short bars per vessel segment in the dimension 0 barcodes. We highlight the topological measures in orange including both the number of loops and number of loops per vessel segment to highlight the effect of the normalisation.